Smoothing Effects of Bagging: Von Mises Expansions of Bagged Statistical Functionals
Authors
Andreas Buja, Werner Stuetzle
Abstract
Bagging is a device intended for reducing the prediction error of learning algorithms. In its simplest form, bagging draws bootstrap samples from the training sample, applies the learning algorithm to each bootstrap sample, and then averages the resulting prediction rules. We extend the definition of bagging from statistics to statistical functionals and study the von Mises expansion of bagged statistical functionals. We show that the expansion is related to the Efron-Stein ANOVA expansion of the raw (unbagged) functional. The basic observation is that a bagged functional is always smooth in the sense that its von Mises expansion exists and is finite, of length $1 + M$, where $M$ is the resample size. This holds even if the raw functional is rough or unstable. The resample size $M$ acts as a smoothing parameter: a smaller $M$ means more smoothing.

Statistics Department, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104-6340. Department of Statistics, University of Washington, Seattle, WA 98195-4322; [email protected]. Research partially supported by NSF grant DMS 9803226.

1 Notations, Definitions and Assumptions for Bagging Statistical Functionals

We need some standard notation and assumptions in order to define bagging for statistics and, more generally, for statistical functionals. Let $\theta$ be a real-valued statistical functional $\theta(F)\colon \mathcal{P} \to \mathbb{R}$ defined on a subset $\mathcal{P}$ of the probability measures on a given sample space. By assumption, all empirical measures $F_M = \frac{1}{M}\sum_{i=1}^{M}\delta_{x_i}$ are contained in $\mathcal{P}$. If $\theta$ is evaluated at an empirical measure, it specializes to a statistic, which we write as $\theta(F_M) = \theta(x_1,\dots,x_M)$; this is a permutation-symmetric function of the $M$ sample points. In what follows we will repeatedly need expectations of random variables $\theta(X_1,\dots,X_M)$, where $X_1,\dots,X_M$ are i.i.d. according to some $F$:
\[
E_F\,\theta(X_1,\dots,X_M) \;=\; \int \theta(x_1,\dots,x_M)\, dF(x_1)\cdots dF(x_M).
\]
Following Breiman (1996), we define bagging of a statistic $\theta(F_N)$ as its average over bootstrap samples $X_1^*,\dots,X_N^*$ drawn i.i.d. from $F_N$:
\[
\bar{\theta}(F_N) \;=\; E_{F_N}\,\theta(X_1^*,\dots,X_N^*).
\]
For our purposes we need to generalize the notion of bagging to statistical functionals $\theta(F)$. First, we divorce the resample size from the sample size $N$ (compare Friedman and Hall 2000; Wu; Goetze; Bickel et al.; ...). To this end, we allow the number $M$ of resamples drawn from $F_N$ to be arbitrary:
\[
\bar{\theta}_M(F_N) \;=\; E_{F_N}\,\theta(X_1^*,\dots,X_M^*).
\]
Note that $M$ is entirely independent of $N$; in particular, $M$ can be smaller or larger than $N$. This separation of $M$ and $N$ allows us to extend the definition of bagging from empirical measures $F_N$ to arbitrary distributions:
\[
\bar{\theta}_M(F) \;=\; E_F\,\theta(X_1^*,\dots,X_M^*),
\]
where the random variables $X_1^*,\dots,X_M^*$ are i.i.d. $F$ and their number $M$ is merely a parameter of the bagging procedure. Unlike for the empirical distribution of an actual sample, for a general probability measure $F$ there is no notion of sample size; the variables $X_i^*$ should still be thought of as bootstrap samples, albeit drawn from an "infinite population". Since the resample size $M$ now denotes a parameter of the bagging procedure, we need to distinguish it from the size $N$ of the actual data $x_1,\dots,x_N$. If one models the data as i.i.d. samples from $F$, one estimates $F$ with the empirical measure $F_N = \frac{1}{N}\sum_{i=1}^{N}\delta_{x_i}$. The functional $\theta(F)$ is then estimated by plug-in with the statistic $\theta(F_N)$:
\[
\hat{\theta}(F) \;=\; \theta(F_N).
\]
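To make these definitions concrete, here is a minimal Python sketch (our illustration, not from the paper) of the Monte Carlo approximation to $\bar{\theta}_M(F_N)$; the function name `bagged`, the number of bootstrap draws `B`, and the choice of the median as a rough statistic $\theta$ are all assumptions of the example.

```python
import numpy as np

def bagged(theta, sample, M, B=2000, rng=None):
    """Monte Carlo approximation of the bagged functional
    theta_M(F_N) = E_{F_N} theta(X*_1, ..., X*_M): draw B resamples
    of size M with replacement from `sample` (i.e. i.i.d. from the
    empirical measure F_N) and average theta over them."""
    rng = np.random.default_rng(rng)
    return np.mean([theta(rng.choice(sample, size=M, replace=True))
                    for _ in range(B)])

# The resample size M is decoupled from the sample size N:
# M = N recovers Breiman's bagging, M < N smooths more.
x = np.random.default_rng(0).standard_normal(100)   # N = 100 data points
print(np.median(x))                  # raw plug-in statistic theta(F_N)
print(bagged(np.median, x, M=100))   # classical bagging, M = N
print(bagged(np.median, x, M=20))    # smaller M: more smoothing
```

The exact bagged value $\bar{\theta}_M(F_N)$ is an expectation over all size-$M$ resamples; the bootstrap average above approximates it and converges as `B` grows.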
Similar Resources
Smoothing Effects of Bagging
Bagging is a device intended for reducing the prediction error of learning algorithms. In its simplest form, bagging draws bootstrap samples from the training sample, applies the learning algorithm to each bootstrap sample, and then averages the resulting prediction rules. We study the von Mises expansion of a bagged statistical functional and show that it is related to the Stein-Efron ANOVA expansion ...
Bagging Down-Weights Leverage Points
Bagging is a procedure that averages estimators trained on bootstrap samples. Numerous experiments have shown that bagged estimates often yield better results than the original predictor, and several explanations have been given to account for this gain. However, six years after its introduction, bagging is still not fully understood. Most explanations given until now are based on global properties ...
Nonparametric Conditional Estimation
Many nonparametric regression techniques (such as kernels, nearest neighbors, and smoothing splines) estimate the conditional mean of Y given X = z by a weighted sum of observed Y values, where observations with X values near z tend to have larger weights. In this report the weights are taken to represent a finite signed measure on the space of Y values. This measure is studied as an estimate of ...
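As a quick illustration of such a weighted-sum estimate, here is a Nadaraya-Watson sketch with a Gaussian kernel (our example; the function name and bandwidth are hypothetical, and a positive kernel gives a probability measure rather than a general signed measure):

```python
import numpy as np

def kernel_conditional_mean(x, y, z, h):
    """Estimate E[Y | X = z] as a weighted sum of observed Y values,
    with Gaussian-kernel weights that favor observations near z."""
    w = np.exp(-0.5 * ((x - z) / h) ** 2)  # unnormalized kernel weights
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 200)
y = np.sin(x) + 0.3 * rng.standard_normal(200)
print(kernel_conditional_mean(x, y, z=0.5, h=0.2))  # roughly sin(0.5)
```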
Parameter Estimation of Some Archimedean Copulas Based on Minimum Cramér-von-Mises Distance
The purpose of this paper is to introduce a new method for estimating the Archimedean copula dependence parameter in the non-parametric setting. The dependence parameter is estimated as the value that minimizes the Cramér-von-Mises distance between the empirical Bernstein Kendall distribution function and the true Kendall distribution function ...
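For concreteness, here is a sketch of the minimum Cramér-von-Mises idea for the Clayton copula, using the plain empirical Kendall distribution function in place of the paper's Bernstein-smoothed version; the function names, the Clayton family, and the search bounds are all assumptions of the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kendall_clayton(t, theta):
    """Kendall distribution K(t) = t - phi(t)/phi'(t) for the Clayton
    copula, whose generator is phi(t) = (t**-theta - 1)/theta."""
    return t + t * (1.0 - t**theta) / theta

def cvm_estimate(u, v):
    """Clayton parameter minimizing the Cramér-von-Mises distance
    between empirical and model Kendall distribution functions."""
    n = len(u)
    # pseudo-observations: proportion of points dominated by (u_i, v_i)
    w = np.array([np.mean((u < u[i]) & (v < v[i])) for i in range(n)])
    t = np.sort(w)
    k_emp = np.arange(1, n + 1) / n   # empirical Kendall function at t
    cvm = lambda theta: np.sum((k_emp - kendall_clayton(t, theta)) ** 2)
    return minimize_scalar(cvm, bounds=(0.01, 20.0), method="bounded").x

# Simulated Clayton(theta = 2) data via conditional inversion.
rng = np.random.default_rng(2)
th, u, p = 2.0, rng.uniform(size=500), rng.uniform(size=500)
v = (1 + u**(-th) * (p**(-th / (1 + th)) - 1))**(-1 / th)
print(cvm_estimate(u, v))   # should land near 2.0
```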
Effects of Bagging and Bias Correction on Estimators Defined by Estimating Equations
Bagging an estimator approximately doubles its bias through the impact of bagging on quadratic terms in expansions of the estimator. This difficulty can be alleviated by bagging a suitably bias-corrected estimator, however. In these and other circumstances, what is the overall impact of bagging and/or bias correction, and how can it be characterised? We answer these questions in the case of gen...
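A toy Monte Carlo check (ours, not the paper's) of the bias-doubling claim: the quadratic estimator $\bar{X}^2$ of $\mu^2$ has bias $\sigma^2/N$, and bagging it with resample size $N$ roughly doubles that bias.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, n, reps, B = 1.0, 25, 4000, 300
raw, bag = [], []
for _ in range(reps):
    x = rng.standard_normal(n) + mu      # N(mu, 1), so sigma^2/n = 0.04
    raw.append(np.mean(x) ** 2)          # quadratic estimator of mu^2
    bag.append(np.mean([np.mean(rng.choice(x, n)) ** 2 for _ in range(B)]))
print(np.mean(raw) - mu**2)   # bias roughly   sigma^2/n = 0.04
print(np.mean(bag) - mu**2)   # bias roughly 2*sigma^2/n = 0.08
```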
Publication date: 2016